feat: add parallel distance computation and vectorized pipeline by vc1492a · Pull Request #43 · vc1492a/PyNomaly

vc1492a · 2020-09-17T15:54:31Z

Summary

This PR addresses #36 by implementing parallelization and vectorization of PyNomaly's distance computation and pipeline, rebased onto the current v0.3.5 codebase.

Changes

Vectorized kNN distances: Replaced the O(n²) Python nested loop with chunked NumPy broadcasting (+ optional scipy.spatial.distance.cdist), yielding significant speedups without new required dependencies
n_jobs parameter: Added cross-cluster multiprocessing via concurrent.futures.ProcessPoolExecutor. Set n_jobs=-1 to use all CPU cores. Follows the scikit-learn convention
Numba parallel mode: Restructured the Numba path with non-generator kernels using numba.prange for proper thread-level parallelism (the previous generator-based approach was incompatible with Numba's parallel mode)
Optional scipy acceleration: Uses scipy.spatial.distance.cdist for distance computation and scipy.special.erf for the error function when scipy is available, with graceful fallback to pure NumPy
Vectorized pipeline: Replaced Python for loops in _standard_distances, _prob_distances, and _norm_prob_outlier_factor with vectorized NumPy operations
Progress bar preserved: Progress bars work across all execution modes (sequential, parallel, Numba) with chunk-level or cluster-level granularity
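
As a rough illustration of the chunked-broadcasting idea in the first bullet, the sketch below computes each point's k nearest-neighbor distances one block of rows at a time, using scipy's cdist when it is importable and NumPy broadcasting otherwise. The function name `knn_distances`, the `chunk_size` argument, and its default are hypothetical and are not taken from PyNomaly's code.

```python
import numpy as np

try:  # optional acceleration; fall back to NumPy if scipy is missing
    from scipy.spatial.distance import cdist
except ImportError:
    cdist = None


def knn_distances(points, k, chunk_size=256):
    """Return (distances, indices) of the k nearest neighbors of every row.

    Hypothetical helper: only a (chunk_size, n) block of the distance
    matrix is held in memory at any one time.
    """
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")
    dists = np.empty((n, k))
    idx = np.empty((n, k), dtype=np.intp)
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        block = points[start:stop]
        if cdist is not None:
            d = cdist(block, points)  # Euclidean metric by default
        else:
            # Broadcasting builds a (chunk, n, dim) array of differences.
            diff = block[:, None, :] - points[None, :, :]
            d = np.sqrt((diff ** 2).sum(axis=-1))
        # Exclude each point from its own neighbor list.
        d[np.arange(stop - start), np.arange(start, stop)] = np.inf
        # Select the k smallest per row, then sort just those k.
        nearest = np.argpartition(d, k - 1, axis=1)[:, :k]
        nearest_d = np.take_along_axis(d, nearest, axis=1)
        order = np.argsort(nearest_d, axis=1)
        idx[start:stop] = np.take_along_axis(nearest, order, axis=1)
        dists[start:stop] = np.take_along_axis(nearest_d, order, axis=1)
    return dists, idx
```

A per-chunk loop like this is also a natural place to update a progress bar, since each iteration is an ordinary Python step.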

API

Fully backward-compatible. The only addition is the optional n_jobs parameter (default 1):

loop.LocalOutlierProbability(data, n_jobs=-1).fit()

All existing function calls, examples, and usage patterns continue to work unchanged.
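
For readers unfamiliar with the scikit-learn `n_jobs` convention, here is a minimal sketch of how a value such as -1 could be turned into a worker count and used with `concurrent.futures.ProcessPoolExecutor`. The helpers `resolve_n_jobs`, `cluster_size`, and `map_clusters` are hypothetical names for illustration; the PR's actual validation rules and per-cluster work may differ.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def resolve_n_jobs(n_jobs):
    """Translate an n_jobs value into a positive worker count (joblib-style)."""
    if not isinstance(n_jobs, int) or isinstance(n_jobs, bool) or n_jobs == 0:
        raise ValueError("n_jobs must be a non-zero integer")
    cpus = os.cpu_count() or 1
    if n_jobs < 0:
        # -1 -> all cores, -2 -> all but one, and so on (never below 1).
        return max(cpus + 1 + n_jobs, 1)
    return n_jobs


def cluster_size(cluster):
    """Stand-in for the real per-cluster work. It is a top-level function
    so that worker processes can pickle a reference to it."""
    return len(cluster)


def map_clusters(func, clusters, n_jobs=1):
    """Apply func to each cluster, using worker processes when more than one
    worker is requested and there is more than one cluster."""
    workers = resolve_n_jobs(n_jobs)
    if workers == 1 or len(clusters) <= 1:
        return [func(c) for c in clusters]  # sequential path, no process overhead
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, clusters))
```

Calling `map_clusters(cluster_size, clusters, n_jobs=-1)` would distribute the clusters across all available cores; with a single cluster the sequential path is taken regardless of `n_jobs`.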

Testing

All 26 existing tests pass unchanged
3 new tests added: test_n_jobs_equivalence, test_n_jobs_single_cluster, test_n_jobs_invalid

Closes #36

coveralls · 2020-09-17T15:55:51Z

Pull Request Test Coverage Report for Build 142

32 of 44 (72.73%) changed or added relevant lines in 1 file are covered.
11 unchanged lines in 1 file lost coverage.
Overall coverage decreased (-6.2%) to 93.188%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
PyNomaly/loop.py	32	44	72.73%

Files with Coverage Reduction	New Missed Lines	%
PyNomaly/loop.py	11	93.19%

Totals
Change from base Build 126:	-6.2%
Covered Lines:	342
Relevant Lines:	367

💛 - Coveralls

vc1492a · 2020-09-17T15:56:06Z

On IBM Power8:

(venv-pynomaly) vconstan@SNA-MINSKY-N03:~/projects/PyNomaly$ python examples/numba_speed_diff.py
/home/vconstan/projects/PyNomaly/PyNomaly/loop.py:518: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function _compute_distance_and_neighbor_matrix failed at nopython mode lowering due to: scipy 0.16+ is required for linear algebra

File "PyNomaly/loop.py", line 537:
    def _compute_distance_and_neighbor_matrix(
        <source elided>
                diff = clust_points_vector[p[0]] - clust_points_vector[p[1]]
                d = np.dot(diff, diff) ** 0.5
                ^

During: lowering "$88call_method.23 = call $82load_method.20(diff, diff, func=$82load_method.20, args=[Var(diff, loop.py:536), Var(diff, loop.py:536)], kws=(), vararg=None)" at /home/vconstan/projects/PyNomaly/PyNomaly/loop.py (537)
  @staticmethod
/home/vconstan/.conda/envs/venv-pynomaly/lib/python3.8/site-packages/numba/core/object_mode_passes.py:177: NumbaWarning: Function "_compute_distance_and_neighbor_matrix" was compiled in object mode without forceobj=True.

File "PyNomaly/loop.py", line 519:
    @staticmethod
    def _compute_distance_and_neighbor_matrix(
    ^

  warnings.warn(errors.NumbaWarning(warn_msg,
/home/vconstan/.conda/envs/venv-pynomaly/lib/python3.8/site-packages/numba/core/object_mode_passes.py:187: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.

For more information visit http://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit

File "PyNomaly/loop.py", line 519:
    @staticmethod
    def _compute_distance_and_neighbor_matrix(
    ^

  warnings.warn(errors.NumbaDeprecationWarning(msg,

vc1492a · 2020-09-17T16:26:02Z

The IBM Power8 issue above was caused by an environment problem: scipy was not installed. Because Numba needs scipy for linear algebra operations such as np.dot, scipy is now listed as an optional requirement in readme.md.
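
The warning arises because Numba's nopython mode relies on SciPy to compile `np.dot`. Installing scipy is the fix adopted here; an alternative, sketched below purely as an illustration, is to accumulate the squared differences in an explicit loop so that the kernel contains no linear-algebra call. The function name `euclidean` is hypothetical, and the snippet applies `numba.njit` only when Numba is importable.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # run as plain Python when Numba is absent
    def njit(func):
        return func


@njit
def euclidean(a, b):
    """Euclidean distance between two 1-D arrays without calling np.dot."""
    total = 0.0
    for i in range(a.shape[0]):
        diff = a[i] - b[i]
        total += diff * diff
    return total ** 0.5
```

Explicit loops of this kind are what Numba compiles well, so avoiding `np.dot` for short vectors should not by itself cost performance, though this has not been measured against the PR's code.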

No Parallelization, only Numba JIT

Numba JIT with Parallelization

🚀 🚀 🚀

vc1492a · 2020-09-17T17:31:16Z

Because there is a trade-off between the number of cores used for parallel computation and the communication overhead between parallel threads, it may be useful to let users set the number of threads that execute concurrently.

This seems to be set through a Numba environment variable, and it may be worth exposing as an additional, optional parameter when executing distance calculations in parallel: https://numba.pydata.org/numba-doc/latest/user/threading-layer.html#setting-the-number-of-threads
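
Numba exposes this setting in two ways: the `NUMBA_NUM_THREADS` environment variable, which is read when Numba is first imported, and `numba.set_num_threads`, which can lower the count at runtime. A minimal sketch of the runtime route follows; the wrapper `limit_numba_threads` is a hypothetical name and is not part of PyNomaly's API.

```python
try:
    import numba
except ImportError:  # keep the sketch importable without Numba
    numba = None


def limit_numba_threads(requested):
    """Cap Numba's worker threads. Returns the count in effect, or None
    when Numba is not installed."""
    if numba is None:
        return None
    # set_num_threads rejects values above the launch-time maximum, which
    # comes from NUMBA_NUM_THREADS (by default, the number of CPU cores).
    ceiling = numba.config.NUMBA_NUM_THREADS
    numba.set_num_threads(max(1, min(int(requested), ceiling)))
    return numba.get_num_threads()
```

Clamping to the launch-time maximum avoids an error when a caller asks for more threads than the machine exposes.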

vc1492a · 2020-09-17T23:15:37Z

Added a num_threads parameter that specifies the number of threads. So far, with the parallelism as currently implemented, adding threads appears to increase computation time when processing 25,000 values.

[ ================================================================================ ] 100.00%
Computation took 94.4145040512085 seconds with Numba JIT with parallel processing, using 1 thread.
[ ================================================================================ ] 100.00%
Computation took 114.98689579963684 seconds with Numba JIT with parallel processing, using 2 thread.
[ ================================================================================ ] 100.00%
Computation took 139.79329085350037 seconds with Numba JIT with parallel processing, using 3 thread.
[ ================================================================================ ] 100.00%
Computation took 168.51009488105774 seconds with Numba JIT with parallel processing, using 4 thread.

More investigation is needed to see if the above behavior is machine-specific or code related, but we now have the ability to parallelize distinct portions of the code and set the number of threads as well when using numba.

vc1492a · 2020-09-18T03:57:18Z

Results from another machine:

[ ================================================================================ ] 100.00%
Computation took 34.91723585128784 seconds with Numba JIT with parallel processing, using 1 thread(s).
[ ================================================================================ ] 100.00%
Computation took 32.24922227859497 seconds with Numba JIT with parallel processing, using 2 thread(s).
[ ================================================================================ ] 100.00%
Computation took 30.427764892578125 seconds with Numba JIT with parallel processing, using 3 thread(s).
[ ================================================================================ ] 100.00%
Computation took 30.22746515274048 seconds with Numba JIT with parallel processing, using 4 thread(s).

vc1492a · 2020-10-01T14:43:27Z

[ ================================================================================ ] 100.00%
Computation took 50.41339111328125 seconds with Numba JIT with parallel processing, using 1 thread(s).
[ ================================================================================ ] 100.00%
Computation took 64.93466305732727 seconds with Numba JIT with parallel processing, using 2 thread(s).
[ ================================================================================ ] 100.00%
Computation took 59.55153703689575 seconds with Numba JIT with parallel processing, using 3 thread(s).
[ ================================================================================ ] 100.00%
Computation took 60.493231773376465 seconds with Numba JIT with parallel processing, using 4 thread(s).
[ ================================================================================ ] 100.00%
Computation took 62.03501510620117 seconds with Numba JIT with parallel processing, using 5 thread(s).
[ ================================================================================ ] 100.00%
Computation took 62.178765058517456 seconds with Numba JIT with parallel processing, using 6 thread(s).
[ ================================================================================ ] 100.00%
Computation took 65.13408589363098 seconds with Numba JIT with parallel processing, using 7 thread(s).
[ ================================================================================ ] 100.00%
Computation took 65.27309513092041 seconds with Numba JIT with parallel processing, using 8 thread(s).
[ ================================================================================ ] 100.00%
Computation took 62.19127082824707 seconds with Numba JIT with parallel processing, using 9 thread(s).
[ ================================================================================ ] 100.00%
Computation took 59.75213074684143 seconds with Numba JIT with parallel processing, using 10 thread(s).
[ ================================================================================ ] 100.00%
Computation took 57.64805293083191 seconds with Numba JIT with parallel processing, using 11 thread(s).
[ ================================================================================ ] 100.00%
Computation took 56.80255579948425 seconds with Numba JIT with parallel processing, using 12 thread(s).
[ ================================================================================ ] 100.00%
Computation took 55.80128788948059 seconds with Numba JIT with parallel processing, using 13 thread(s).
[ ================================================================================ ] 100.00%
Computation took 56.00968599319458 seconds with Numba JIT with parallel processing, using 14 thread(s).
[ ================================================================================ ] 100.00%
Computation took 56.198336124420166 seconds with Numba JIT with parallel processing, using 15 thread(s).
[ ================================================================================ ] 100.00%
Computation took 57.532896995544434 seconds with Numba JIT with parallel processing, using 16 thread(s).

Results from another run.

medvidov · 2020-10-03T03:59:45Z

Results from another machine (4 core CPU, running from WSL):

[ ================================================================================ ] 100.00%
Computation took 51.52172231674194 seconds with Numba JIT with parallel processing, using 1 thread(s).
[ ================================================================================ ] 100.00%
Computation took 54.880839347839355 seconds with Numba JIT with parallel processing, using 2 thread(s).
[ ================================================================================ ] 100.00%
Computation took 55.5437228679657 seconds with Numba JIT with parallel processing, using 3 thread(s).
[ ================================================================================ ] 100.00%
Computation took 54.710304260253906 seconds with Numba JIT with parallel processing, using 4 thread(s).
[ ================================================================================ ] 100.00%
Computation took 56.60258507728577 seconds with Numba JIT with parallel processing, using 5 thread(s).
[ ================================================================================ ] 100.00%
Computation took 55.15400314331055 seconds with Numba JIT with parallel processing, using 6 thread(s).
[ ================================================================================ ] 100.00%
Computation took 55.54375123977661 seconds with Numba JIT with parallel processing, using 7 thread(s).
[ ================================================================================ ] 100.00%
Computation took 54.39351201057434 seconds with Numba JIT with parallel processing, using 8 thread(s).

vc1492a · 2021-02-03T18:30:55Z

Refactored how the processing is handled so that using Numba with more cores now yields a speed improvement. Once I handle the issue below, I'll report back with timing numbers.

Multi-core processing required changes to the progress bar, which is still a work in progress. A key challenge is flushing stdout in a way that is compatible with Numba: print statements are supported inside Numba-compiled functions, but sys.stdout.flush() does not appear to be.
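
One way to sidestep the flush limitation, offered here only as a sketch, is to keep all printing outside the compiled code: call the kernel once per chunk and redraw the bar from ordinary Python between calls. `process_chunk` below is a hypothetical stand-in for a Numba-compiled kernel, and `run_with_progress` is likewise an illustrative name.

```python
import sys


def process_chunk(values):
    """Stand-in for a compiled kernel; it does no printing of its own."""
    return sum(v * v for v in values)


def run_with_progress(data, chunk_size=4, width=20, out=sys.stdout):
    """Process data chunk by chunk, redrawing a text progress bar after each."""
    total = 0
    n = len(data)
    for start in range(0, n, chunk_size):
        total += process_chunk(data[start:start + chunk_size])
        done = min(start + chunk_size, n)
        filled = width * done // n
        out.write("\r[ " + "=" * filled + " " * (width - filled)
                  + " ] {:.2f}%".format(100.0 * done / n))
        out.flush()  # safe here: we are outside the compiled function
    out.write("\n")
    return total
```

The trade-off is granularity: the bar advances once per chunk rather than once per point, so smaller chunks give smoother feedback at the cost of more calls into the kernel.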

vc1492a · 2024-04-29T19:20:27Z

Placing this issue on hold while other repository issues are resolved - this is low priority and can be resolved at a later time.

Rewrite the distance computation engine from scratch on top of v0.3.5:
- Vectorized kNN distances using NumPy broadcasting with chunked processing for memory efficiency and progress bar support
- Add n_jobs parameter for cross-cluster multiprocessing via concurrent.futures (n_jobs=-1 uses all cores)
- Restructure Numba path with non-generator kernels that support numba.prange for thread-level parallelism
- Optional scipy.spatial.distance.cdist and scipy.special.erf acceleration when scipy is available
- Vectorize _standard_distances, _prob_distances, and _norm_prob_outlier_factor pipeline methods
- Fully backward-compatible: all existing API calls work unchanged

Closes #36
Made-with: Cursor

vc1492a added enhancement New feature of request in progress This issue is being actively worked on labels Sep 17, 2020

vc1492a self-assigned this Sep 17, 2020

vc1492a mentioned this pull request Sep 17, 2020

parallelize #36

Open

vc1492a added the help wanted Extra attention is needed label Sep 18, 2020

vc1492a added on hold This issue to be resolved at a later time and removed in progress This issue is being actively worked on labels Apr 29, 2024

vc1492a added the low priority This issue is a lower priority relative to other open issues label Apr 29, 2024

vc1492a and others added 9 commits August 13, 2025 09:21

Merge pull request #80 from vc1492a/dev

323ae22

Updated `readme.md` to update the total number and monthly number of package downloads.

Merge pull request #82 from vc1492a/dev

f292eb4

chore: remove Python 3.6 and 3.7 support

Merge pull request #84 from vc1492a/dev

8477f68

chore: update readme.md with another core library example

feat: refactor Validation class for ease of use

c18f3bf

fix: scalar position assignment

155ba8a

Merge pull request #85 from vc1492a/69-refactor-validation-class

5700c07

feat: refactor Validation class for ease of use

readme

969d910

Merge pull request #86 from vc1492a/readme

d0ace6f

readme

vc1492a force-pushed the feature/numba_parallel branch from 5632d31 to 5e93be2 Compare March 20, 2026 18:12

vc1492a changed the title ~~[WIP] - Feature/numba parallel~~ feat: add parallel distance computation and vectorized pipeline Mar 20, 2026

vc1492a changed the base branch from dev to main March 20, 2026 18:12

vc1492a changed the base branch from main to dev March 20, 2026 18:13

chore: bump version to 0.4.0 and update changelog

8dd865a

Update version across loop.py, setup.py, and README badge. Add changelog entry documenting all new features and improvements. Made-with: Cursor

vc1492a wants to merge 10 commits into dev from feature/numba_parallel
